712 Assignment 2

Author: Marah Shahin
ID: 500055421
UPI: msha846

Description of Activities

Activity 1: Swimming

The first activity was swimming. The lane was 25 metres in length and the swimming style was backstroke. The phone was mounted on the left arm facing leftward from the subject (figure below). The phone was secured vertically and the subject experienced little movement from the phone throughout the activity.

Activity 2: Football

The next activity was football. Specifically, the subject passed a ball back and forth for just over 2 minutes. The phone was mounted in the same location as in the first activity to avoid unnecessary differences between data collections. A snapshot of the activity and phone can be seen in the figure below.

Plot Comments

Football clearly produces a greater overall variance in magnitude throughout the activity. Interestingly, the transition period for swimming is predominantly of greater magnitude than the football activity signal itself (possibly due to getting out of the pool, taking the band off, etc.). The transition period for football is of a much smaller magnitude (another person ended the recording for me). Nevertheless, these periods are clearly visible and will be removed in the next steps. Even by eye, the two signals differ in magnitude and pattern, so they are likely to yield a reliable classification model.
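As a minimal sketch of what "variance of magnitude" means here, the snippet below computes the Euclidean magnitude of each sample and compares the variance of two short signals. It assumes the recorded signal is a three-axis reading (x, y, z); the sample values and the names `steady` and `bursty` are made up for illustration and are not the recorded data.

```python
import math
import statistics

def magnitude(samples):
    """Return the Euclidean norm of each (x, y, z) sample."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]

# Hypothetical readings, for illustration only (not the recorded data).
steady = [(0.0, 0.0, 1.0), (0.0, 0.1, 1.0), (0.1, 0.0, 1.0)]
bursty = [(0.0, 0.0, 1.0), (2.0, 1.0, 2.0), (0.0, 0.5, 0.5)]

# Population variance of the magnitude signal for each example.
steady_var = statistics.pvariance(magnitude(steady))
bursty_var = statistics.pvariance(magnitude(bursty))
print(steady_var < bursty_var)
```

A signal with occasional large movements, like `bursty`, ends up with the larger variance, which is the pattern described for football above.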

Remove Transition Periods

Length of signal after transition period was removed

The signals are now 142 seconds (first sample is at 0.4 s) and 135 seconds (141 - 6) long for the swimming and football activities, respectively.
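The trimming step can be sketched roughly as follows, using the football cut points of 6 s and 141 s quoted above. The row layout `(time_s, magnitude)`, the 0.5 s sample spacing, and the helper name `trim` are assumptions made for this example only; the actual recording may be stored and sampled differently.

```python
def trim(rows, start_s, end_s):
    """Keep rows whose timestamp t satisfies start_s <= t < end_s."""
    return [row for row in rows if start_s <= row[0] < end_s]

# Hypothetical recording: one (time_s, magnitude) row every 0.5 s for 150 s.
rows = [(i * 0.5, 1.0) for i in range(300)]

kept = trim(rows, start_s=6.0, end_s=141.0)

# Duration covered by the kept rows, counting one sample period for the last row.
duration_s = kept[-1][0] - kept[0][0] + 0.5
print(duration_s)
```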

Add windows

Number of windows in each activity

There are 141 complete windows (100 rows each) for swimming and 135 complete windows for football.
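One way the window count could be produced is sketched below: the signal is cut into consecutive blocks of 100 rows and any short final block is dropped. Non-overlapping windows are an assumption here, as are the helper name `complete_windows` and the made-up signal length.

```python
def complete_windows(rows, size=100):
    """Split rows into consecutive non-overlapping windows, dropping a short final one."""
    n_windows = len(rows) // size
    return [rows[i * size:(i + 1) * size] for i in range(n_windows)]

# Hypothetical signal with 14,150 samples: 141 full windows plus 50 leftover rows.
signal = list(range(14150))

windows = complete_windows(signal)
print(len(windows), len(windows[-1]))
```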

Extract Features

Feature Matrix Attributes

The X feature matrix has a shape of 276 rows × 3136 columns. The 276 rows match the total number of windows (141 + 135), and the 3136 columns are the extracted features, as shown in the 'X' dataframe above.
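To illustrate how such a matrix is organised, the sketch below builds one feature row per window, so the row count equals the number of windows. It computes only four hand-picked statistics rather than the 3136 features reported above, and it does not reproduce whichever extraction tool was actually used. The window contents and the names `window_features` and `feature_matrix` are hypothetical.

```python
import statistics

def window_features(window):
    """Summarise one window of signal values as a dict of named features."""
    return {
        "mean": statistics.fmean(window),
        "std": statistics.pstdev(window),
        "min": min(window),
        "max": max(window),
    }

def feature_matrix(windows):
    """Build one feature row per window."""
    return [window_features(w) for w in windows]

# Hypothetical windows standing in for the 141 swimming and 135 football windows.
swim = [[1.0 + 0.01 * (i % 5)] * 100 for i in range(141)]
foot = [[1.0, 3.0] * 50 for _ in range(135)]

X = feature_matrix(swim + foot)
y = [0] * len(swim) + [1] * len(foot)  # 0 = swimming, 1 = football
print(len(X), len(X[0]), len(y))
```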

Only choose most important features

Repeated Cross-validation

The scores barely change as the training and test sets change, so the model does not appear to be overfitting.

The score is very close to 1.0, which indicates that this is a very good classifier.
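The idea behind repeated cross-validation can be sketched in plain Python: reshuffle the data several times, split it into k folds each time, and collect one score per fold. The classifier here is a stand-in (a one-feature nearest-class-mean rule), the folds are not stratified, and the data is synthetic, so none of this reflects the model or features used in this assignment.

```python
import random
import statistics

def nearest_mean_accuracy(train, test):
    """Fit a one-feature nearest-class-mean rule on train and score it on test."""
    means = {}
    for label in {lab for _, lab in train}:
        means[label] = statistics.fmean([v for v, lab in train if lab == label])
    hits = sum(
        1 for v, lab in test
        if min(means, key=lambda m: abs(v - means[m])) == lab
    )
    return hits / len(test)

def repeated_kfold_scores(data, k=5, repeats=3, seed=0):
    """Return one accuracy per fold for each reshuffled repeat."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        for fold in range(k):
            # Every k-th row (offset by fold) is held out; the rest is training data.
            test = shuffled[fold::k]
            train = [row for i, row in enumerate(shuffled) if i % k != fold]
            scores.append(nearest_mean_accuracy(train, test))
    return scores

# Hypothetical, well-separated one-feature data (not the assignment's features).
rng = random.Random(1)
data = [(rng.gauss(0.0, 0.2), 0) for _ in range(60)]
data += [(rng.gauss(2.0, 0.2), 1) for _ in range(60)]

scores = repeated_kfold_scores(data)
print(len(scores), statistics.fmean(scores), statistics.pstdev(scores))
```

A high mean together with a small standard deviation across the folds is the pattern described above: the score stays stable regardless of which rows are held out.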

Visualising Important Features

Pair plot comments

The orange and blue dots are reasonably distinct in the majority of the plots. This could explain why the model does well in distinguishing between the two signals.

Extract minimal set of features

Visualising some common features

Pair plot comments

The orange and blue dots are reasonably distinct in the majority of the plots. This could explain why the model does well in distinguishing between the two signals.